21 research outputs found

    Systems for AutoML Research

    Genetic automated machine learning assistant

    GAMA is an AutoML package for end-users and AutoML researchers. It uses genetic programming to efficiently generate optimized machine learning pipelines given specific input data and resource constraints. A machine learning pipeline contains data preprocessing as well as a machine learning algorithm, with fine-tuned hyperparameter settings. Document type: Articl
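    The idea of evolving pipelines (preprocessing plus a learner plus hyperparameters) can be illustrated with a deliberately simplified sketch. It is a mutation-only evolutionary search, not genetic programming over pipeline trees, and it is not GAMA's implementation or API. The search space, the names `random_pipeline`, `mutate`, `toy_score`, and `evolve`, and the scoring rule are all hypothetical; in practice the score would be a cross-validated metric computed under a time budget.

```python
import random

# Hypothetical search space; component names are illustrative, not GAMA's.
SEARCH_SPACE = {
    "scaler": ["none", "standard", "minmax"],
    "learner": ["tree", "knn", "linear"],
    "depth": [1, 2, 4, 8],
}


def random_pipeline(rng):
    """Draw one pipeline configuration uniformly at random."""
    return {step: rng.choice(options) for step, options in SEARCH_SPACE.items()}


def mutate(pipeline, rng):
    """Return a copy of the pipeline with one randomly chosen step resampled."""
    child = dict(pipeline)
    step = rng.choice(list(SEARCH_SPACE))
    child[step] = rng.choice(SEARCH_SPACE[step])
    return child


def toy_score(pipeline):
    """Stand-in for cross-validated performance (assumption: integer, max 10)."""
    score = 0
    score += 3 if pipeline["scaler"] == "standard" else 0
    score += 4 if pipeline["learner"] == "tree" else 0
    score += 3 if pipeline["depth"] == 4 else 0
    return score


def evolve(generations=30, population_size=10, seed=0):
    """Keep the better half each generation and refill with mutated survivors."""
    rng = random.Random(seed)
    population = [random_pipeline(rng) for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=toy_score, reverse=True)
        survivors = population[: population_size // 2]
        children = [
            mutate(rng.choice(survivors), rng)
            for _ in range(population_size - len(survivors))
        ]
        population = survivors + children
    return max(population, key=toy_score)


best = evolve()
print(best, toy_score(best))
```

    Because the best half always survives, the best score in the population never decreases from one generation to the next; the number of generations plays the role of the resource constraint.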

    Meta-Learning for Symbolic Hyperparameter Defaults

    Hyperparameter optimization in machine learning (ML) deals with the problem of empirically learning an optimal algorithm configuration from data, usually formulated as a black-box optimization problem. In this work, we propose a zero-shot method to meta-learn symbolic default hyperparameter configurations that are expressed in terms of the properties of the dataset. This enables a much faster, but still data-dependent, configuration of the ML algorithm, compared to standard hyperparameter optimization approaches. In the past, symbolic and static default values have usually been obtained as hand-crafted heuristics. We propose an approach of learning such symbolic configurations as formulas of dataset properties from a large set of prior evaluations on multiple datasets by optimizing over a grammar of expressions using an evolutionary algorithm. We evaluate our method on surrogate empirical performance models as well as on real data across 6 ML algorithms on more than 100 datasets and demonstrate that our method indeed finds viable symbolic defaults. Comment: Pieter Gijsbers and Florian Pfisterer contributed equally to the paper. V1: Two-page GECCO poster paper accepted at GECCO 2021. V2: The original full-length paper (8 pages) with appendi
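    A symbolic default, as described above, is a formula over dataset properties rather than a fixed constant. The sketch below shows only how such formulas are applied to a new dataset; it does not learn them. The two formulas used are well-known hand-crafted heuristics (square root of the feature count for `max_features`, one over the feature count for an RBF kernel's `gamma`), not results of the paper's method, and the names `meta_features`, `SYMBOLIC_DEFAULTS`, and `configure` are hypothetical.

```python
import math


def meta_features(n_rows, n_features):
    """Collect the dataset properties the formulas may refer to (illustrative)."""
    return {"n": n_rows, "p": n_features}


# Each hyperparameter maps to a formula over dataset properties.
# These are classic hand-crafted heuristics used as stand-ins for learned ones.
SYMBOLIC_DEFAULTS = {
    "max_features": lambda m: max(1, round(math.sqrt(m["p"]))),
    "gamma": lambda m: 1.0 / m["p"],
}


def configure(defaults, m):
    """Evaluate every symbolic default for one dataset; no search is needed."""
    return {name: formula(m) for name, formula in defaults.items()}


config = configure(SYMBOLIC_DEFAULTS, meta_features(n_rows=1000, n_features=16))
print(config)
```

    Evaluating the formulas costs a few arithmetic operations per dataset, which is why this kind of configuration is much faster than running an optimizer, while still adapting to the data.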

    GAMA: genetic automated machine learning assistant

    GAMA is an AutoML package for end-users and AutoML researchers. It uses genetic programming to efficiently generate optimized machine learning pipelines given specific input data and resource constraints. A machine learning pipeline contains data preprocessing as well as a machine learning algorithm, with fine-tuned hyperparameter settings

    An open source AutoML benchmark

    In recent years, an active field of research has developed around automated machine learning (AutoML). Unfortunately, comparing different AutoML systems is hard and often done incorrectly. We introduce an open, ongoing, and extensible benchmark framework which follows best practices and avoids common mistakes. The framework is open-source, uses public datasets and has a website with up-to-date results. We use the framework to conduct a thorough comparison of 4 AutoML systems across 39 datasets and analyze the results.

    Visual exploration of migration patterns in gull data

    We present a visual analytics approach to explore and analyze movement data as collected by ecologists interested in understanding migration. Migration is an important and intriguing process in animal ecology, which may be better understood through the study of tracks for individuals in their environmental context. Our approach enables ecologists to explore the spatio-temporal characteristics of such tracks interactively. It identifies and aggregates stopovers depending on the scale at which the data is visualized. Statistics of stopover sites and links between them are shown on a zoomable geographic map, which allows interactive exploration of directed sequences of stopovers from an origin to a destination. In addition, the spatio-temporal properties of the trajectories are visualized by means of a density plot on a geographic map and a calendar view. To evaluate our visual analytics approach, we applied it to a data set of 75 migrating gulls that were tracked over a period of 3 years. The evaluation by an expert user confirms that our approach supports ecologists in their analysis workflow by helping to identify interesting stopover locations, environmental conditions or (groups of) individuals with characteristic migratory behavior, and therefore allows them to focus on visual data analysis.

    OpenML-Python: an extensible Python API for OpenML

    OpenML is an online platform for open science collaboration in machine learning, used to share datasets and results of machine learning experiments. In this paper, we introduce OpenML-Python, a client API for Python, which opens up the OpenML platform for a wide range of Python-based machine learning tools. It provides easy access to all datasets, tasks and experiments on OpenML from within Python. It also provides functionality to conduct machine learning experiments, upload the results to OpenML, and reproduce results which are stored on OpenML. Furthermore, it comes with a scikit-learn extension and an extension mechanism to easily integrate other machine learning libraries written in Python into the OpenML ecosystem. Source code and documentation are available at https://github.com/openml/openml-python/